We analyze the spreading of viruses in scale-free networks with high clustering and degree correlations,
as found in the Internet graph. For the Susceptible-Infected-Susceptible model of epidemics
the prevalence undergoes a phase transition at a finite threshold of the transmission probability. 
Comparing with the absence of a finite threshold in networks with purely random wiring, our result 
suggests that high clustering and degree correlations protect scale-free networks against the spread- 
ing of viruses. We introduce and verify a quantitative description of the epidemic threshold based 
on the connectivity of the neighborhoods of the hubs. 
The description of the properties of several real networks
has revealed that, despite their different nature,
they share some common features. They typically
show a scale-free distribution of degree, high clustering,
and a short average path length. Although their topological
properties have been studied in detail, a natural
question that arises concerns the dynamical properties that
result from the different topologies. An example where
the interaction network is crucial for the dynamics is the
case of disease spreading. The study of complex networks
as models of social, technological and biological
interaction has been shown to give valuable insights into
how viruses, diseases and rumours spread.

Most of these investigations have been performed as- 
suming networks with homogeneous connectivity, where 
all individuals have approximately the same number (de- 
gree) of contacts with others. The network is typically 
modeled as a regular lattice, a random graph, or a superposition
of these two [1]. For such topologies the number
of infected individuals undergoes a phase transition:
the single-contact transmission probability needs to exceed
a critical threshold for a disease to become epidemic.
Recently, however, it has been discovered that
many networks involved in the spread of diseases have a 
scale-free distribution of degree with a regime of power 
law decay. In particular, networks such as the web of human
sexual contacts [12] and the web of electronic mail
communication [13] contain highly connected individuals,
called hubs, which had been disregarded
by the assumption of homogeneous connectivity in previous
works. The first model studies of disease spread
in scale-free networks including hubs have revealed the 
absence of an epidemic threshold. Therefore it has been 
claimed that in technological and sociological networks 
even viruses with extremely low transmission probabil- 
ity can spread, and any prophylactic strategies aiming at 
a reduction of the average infectiveness would never re- 
sult in a total eradication of a prevalent virus. However, 
the alarming predictions have been obtained assuming 
random mixing. Apart from the scale-free degree distribution,
all non-trivial topological properties of real-world
sociological and technological networks have been
neglected. 

This Letter is dedicated to the analysis of virus spreading
in networks with local structure. In order to account
for the large clustering coefficient and the presence of
degree correlations [14], we model the potentially infective
contacts by highly clustered scale-free networks [15].
We find that the single-contact transmission probability
needs to exceed a finite threshold for a virus to spread
and prevail. Thus the behavior of epidemics is qualitatively
different in highly clustered scale-free networks
as compared with randomly wired scale-free networks.
We conjecture that the difference can be explained by
the presence or absence of connections between the hubs.
Based on this conjecture, we define a new quantity, the
secondary reproductive number, which predicts the epidemic
threshold for highly clustered and randomly wired
scale-free networks, as well as for the Internet graph as
an example of a real-world scale-free network [16].

We consider the susceptible-infected-susceptible (SIS)
model as a simple description of epidemic spreading in
a population. Each individual in the population is
either infected or susceptible at any point in time. The
potential infection pathways are described by interpreting
the individuals as the nodes of a network. The
time-discrete dynamics is defined by synchronously updating
the states of all individuals with the following rules: If
individual A is infected at time $t-1$, it is susceptible
at time $t$. If, otherwise, individual A is susceptible at
time $t-1$ and is connected to at least one individual
infected at the same time, then with probability $\lambda$
individual A is infected at time $t$. An important observable
is the prevalence $\rho$. It is the time average of the fraction
of infected individuals reached after a transient from the
initial condition. Given a network, the only parameter of
the model is the infection probability $\lambda$. The information
on the global spreading of a disease is contained in the
function $\rho(\lambda)$.
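
The update rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the simulations reported here: the function names `sis_step` and `prevalence`, the adjacency-dictionary representation, the initial condition (a random half of the population infected) and the step counts are all assumptions made for the example.

```python
import random


def sis_step(neighbors, infected, lam, rng):
    """One synchronous SIS update; returns the set of nodes infected at time t."""
    new_infected = set()
    for node, nbrs in neighbors.items():
        if node in infected:
            continue  # infected at t-1 -> susceptible at t
        # A susceptible node with at least one infected neighbor
        # becomes infected with probability lam.
        if any(n in infected for n in nbrs) and rng.random() < lam:
            new_infected.add(node)
    return new_infected


def prevalence(neighbors, lam, steps=200, transient=100, seed=0):
    """Time-averaged fraction of infected nodes after discarding a transient."""
    rng = random.Random(seed)
    nodes = list(neighbors)
    # Assumption: start with a random half of the population infected.
    infected = set(rng.sample(nodes, max(1, len(nodes) // 2)))
    total = 0.0
    for t in range(steps):
        infected = sis_step(neighbors, infected, lam, rng)
        if t >= transient:
            total += len(infected) / len(nodes)
    return total / (steps - transient)
```

Scanning `prevalence` over a range of `lam` values for a fixed network gives a numerical estimate of the prevalence curve discussed in the text.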

The individuals are connected by highly clustered 
scale-free networks [15]. They are constructed by iteratively
adding nodes and links in the following way: Generate
a new node and connect it with all active nodes.
Set the new node active as well. Set inactive one of the
active nodes. The probability for deactivating node $i$ is
inversely proportional to its current number of links $k_i$.
Close the iteration loop by generating the next new node
and so forth, until the network size reaches the desired
value $N$. Starting from an initial network of $m$ fully
interconnected active nodes, a network with an average
degree of $\langle k\rangle = 2m$ links per node is generated.
The degree distribution follows a power law $P(k) = 2m^2 k^{-3}$,
and the clustering coefficient is $C = 5/6$. Note that the
deactivation mechanism mentioned here is part only of 
the growth mechanism of the network. It is not related 
to the dynamics of the SIS model which is applied after 
the network has been constructed. 
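
The growth rule described above can be written as a short Python sketch. It is meant as an illustration under stated assumptions rather than the original implementation: the function name `clustered_scale_free`, the adjacency-set representation and the use of `random.choices` for the degree-weighted deactivation are choices made for this example.

```python
import random


def clustered_scale_free(n, m, seed=0):
    """Grow a network of n nodes by the activation/deactivation rule."""
    rng = random.Random(seed)
    # Initial network: m fully interconnected active nodes.
    neighbors = {i: set(range(m)) - {i} for i in range(m)}
    active = list(range(m))
    for new in range(m, n):
        # Connect the new node to all currently active nodes.
        neighbors[new] = set(active)
        for a in active:
            neighbors[a].add(new)
        active.append(new)  # the new node becomes active as well
        # Deactivate one active node with probability proportional to 1/k_i.
        weights = [1.0 / len(neighbors[a]) for a in active]
        drop = rng.choices(range(len(active)), weights=weights)[0]
        active.pop(drop)
    return neighbors
```

Each new node adds exactly `m` links, so the average degree approaches `2 * m` for large `n`, in line with the value quoted in the text.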

By extensive simulations we have obtained the prevalence
$\rho(\lambda)$ for populations of $N = 10^5$ individuals
connected by highly clustered scale-free networks. In Fig. 1
we plot the fraction of infected individuals in the stationary
state, $\rho$, for different values of the average connectivity.
Only when $\lambda$ is increased above a value $\lambda_c$ is a
significant prevalence observed. The effect of the topological
properties of the highly clustered scale-free networks becomes
clear when comparing the shape of the prevalence
curves with those obtained for randomly wired scale-free
networks. In the latter case no change of behavior is
apparent, as the prevalence and its slope vary smoothly
when $\lambda$ is increased.

Further insight into the behavior of epidemics in highly
clustered scale-free networks is gained from the time evolution
of the survival probability $P_s$ shown in the inset
of Fig. 1. Taking initial conditions with exactly one randomly
chosen site infected, $P_s(t)$ is the fraction of realizations
that contain at least one infected site after $t$
time steps. For values of $\lambda$ well below the threshold $\lambda_c$
the disease dies out exponentially, whereas for $\lambda$ above $\lambda_c$
the survival probability $P_s$ approaches a non-zero plateau
value. The change of behavior from rapid eradication to
non-zero prevalence is observed at a finite value of the
transmission probability, independent of the system size.
Thus the prevalence of the SIS model in highly clustered
scale-free networks undergoes a phase transition at a finite
critical value $\lambda_c$ of the transmission probability. In
other words, viruses with a low transmission probability
do not prevail in these networks.
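
The survival probability defined above can be estimated along the following lines. This is a minimal sketch: the function name `survival_probability`, the default number of runs and the adjacency-dictionary input are assumptions for illustration, and the SIS update is re-implemented inline so that the block is self-contained.

```python
import random


def survival_probability(neighbors, lam, t_max, runs=1000, seed=0):
    """Fraction of runs that still contain an infected node after t steps."""
    rng = random.Random(seed)
    nodes = list(neighbors)
    alive = [0] * (t_max + 1)
    for _ in range(runs):
        infected = {rng.choice(nodes)}  # exactly one randomly chosen seed
        for t in range(t_max + 1):
            if not infected:
                break  # absorbing state: the disease has died out
            alive[t] += 1
            # Synchronous SIS update: infected nodes recover, susceptible
            # nodes with an infected neighbor are infected with prob. lam.
            infected = {
                v for v in nodes
                if v not in infected
                and any(u in infected for u in neighbors[v])
                and rng.random() < lam
            }
    return [a / runs for a in alive]
```

Plotting the returned list on a logarithmic scale for several values of `lam` should distinguish exponential decay from saturation at a plateau.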

In order to understand the role played by the topology
we consider the average connectivity of the neighbors of
a node $i$,

$$\langle k^{nn}_i \rangle = \frac{1}{k_i} \sum_{j \in \mathcal{V}(i)} k_j , \qquad (1)$$

where $k_j$ is the degree of node $j$ and the neighborhood of
node $i$ (i.e., the set of nodes directly connected to node
$i$) is called $\mathcal{V}(i)$.
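
As a small illustration, the average connectivity of the neighbors of each node (the sum of the neighbors' degrees divided by the node's own degree) can be computed as follows. The function name `average_neighbor_degree` and the adjacency-dictionary input are assumptions made for this sketch.

```python
def average_neighbor_degree(neighbors):
    """Map each node i to the mean degree of the nodes in its neighborhood."""
    degree = {i: len(nbrs) for i, nbrs in neighbors.items()}
    return {
        i: sum(degree[j] for j in nbrs) / degree[i]
        for i, nbrs in neighbors.items()
        if degree[i] > 0  # isolated nodes have no neighborhood
    }


# Example: a star with hub 0 and four leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
knn = average_neighbor_degree(star)
```

In the star example the hub sees only degree-one neighbors, whereas each leaf sees the hub, which is the simplest case of a hub connected exclusively to low-degree nodes.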

The structure of the highly clustered scale-free networks
gives rise to correlations between the degree of a
node and the degrees of its neighbors (see Fig. 2). For
weakly connected nodes, $\langle k^{nn}\rangle$ decays. For the hubs,
$k \gg \langle k\rangle$, it reaches a constant value [17]

$$\langle k^{nn}_h\rangle = \langle k\rangle - 1 . \qquad (2)$$

In order to unify the treatment of random and structured
networks we also calculate $\langle k^{nn}\rangle$ for random
scale-free networks. If $P_c(k'|k)$ is the conditional probability
that a link belonging to a node with connectivity $k$ points
to a node with connectivity $k'$, then

$$\langle k^{nn}\rangle = \sum_{k'} k' P_c(k'|k) = \sum_{k'} \frac{k'^2}{\langle k\rangle} P(k') = \frac{\langle k^2\rangle}{\langle k\rangle} , \qquad (3)$$

where we have used $P_c(k'|k) \propto k'P(k')$ for random networks.
Now we specifically consider randomly wired
scale-free networks with the degree distribution
$P(k) = 2m^2 k^{-3}$, the same distribution as in the highly clustered
scale-free networks considered before. The networks are
generated using the algorithm introduced in Ref. [18].
Ordering the nodes with respect to decreasing degree, every
node is identified by its rank $i$. The degree of node $i$ is
given by

$$k_i = m \left(\frac{N}{i}\right)^{1/2} . \qquad (4)$$

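Inserting $\langle k^2\rangle = \frac{1}{N}\sum_{i=1}^{N} k_i^2 = \frac{\langle k\rangle^2}{4}\ln N + \mathcal{O}(N^{-1})$
into Eq. (3) we obtain

$$\langle k^{nn}\rangle = \frac{\langle k\rangle}{4}\ln N , \qquad (5)$$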

independent, on average, of the node under consideration.
This independence is confirmed numerically, see
Fig. 2. It reflects the absence of correlations in the
connectivity. Figure 3 shows the logarithmic dependence of
$\langle k^{nn}\rangle$ on system size, in contrast with the constant value
obtained for the hubs in the structured (highly clustered
scale-free) networks.
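
The logarithmic growth can be checked numerically with a short sketch. It assumes the rank-ordered degree sequence $k_i = m (N/i)^{1/2}$, which is the sequence consistent with $P(k) = 2m^2 k^{-3}$ and $\langle k\rangle = 2m$; the function name `knn_random` is hypothetical. The script compares $\langle k^2\rangle / \langle k\rangle$ with $(\langle k\rangle/4)\ln N$; agreement is expected only to leading order in $\ln N$.

```python
import math


def knn_random(n, m):
    """Return (<k^2>/<k>, <k>) for the degree sequence k_i = m*sqrt(n/i)."""
    degrees = [m * math.sqrt(n / i) for i in range(1, n + 1)]
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k2 / mean_k, mean_k


for n in (10**3, 10**4, 10**5):
    knn, mean_k = knn_random(n, m=3)
    # Columns: system size, <k^2>/<k>, leading-order prediction (<k>/4) ln N.
    print(n, round(knn, 2), round(mean_k / 4 * math.log(n), 2))
```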

Now the different connectivity of the hubs in the highly 
clustered and random scale-free networks (both having 
the same degree distribution) is clear: Whereas in the 
random case a hub is connected to other highly connected 
nodes, in the highly clustered networks the hubs are almost
exclusively connected to low degree nodes. This
difference turns out to be essential for the epidemic dynamics.

But how is this topological property related to the
transmission threshold found for the SIS model? Let us
define the secondary reproductive number as

$$R_2 = \lambda \langle k^{nn}_h\rangle . \qquad (6)$$

We show below that the condition $R_2 = 1$ recovers a previous
prediction for the epidemic threshold in randomly
wired networks, and gives a good estimate for
the highly clustered scale-free networks and the Internet
graph.
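
A minimal sketch of how the condition $R_2 = 1$ could be turned into a threshold estimate for a given network follows. The function name `secondary_threshold` and the parameter `n_hubs` (how many of the largest-degree nodes are treated as hubs) are assumptions introduced for this example; the network is assumed to contain at least one link.

```python
def secondary_threshold(neighbors, n_hubs=1):
    """Estimate the threshold 1 / <k_h^nn> from the n_hubs largest hubs."""
    degree = {i: len(nbrs) for i, nbrs in neighbors.items()}
    hubs = sorted(degree, key=degree.get, reverse=True)[:n_hubs]
    # Mean degree of the neighbors, averaged over the selected hubs.
    knn_h = sum(
        sum(degree[j] for j in neighbors[h]) / degree[h] for h in hubs
    ) / len(hubs)
    return 1.0 / knn_h


star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
complete = {i: [j for j in range(4) if j != i] for i in range(4)}
```

For the star the hub has only degree-one neighbors, so the estimate is 1; for the complete graph on four nodes every neighbor has degree 3, so the estimate is 1/3.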

Previously, the behavior of the epidemics has been described
in terms of the basic reproductive number, $R_0$
[19]. It is defined as the average number of secondary
infections produced by an infectious individual in a totally
susceptible population and indicates whether a disease
can ever invade a population. For random networks with
broad degree distribution, the basic reproductive number
is given by

$$R_0 = \lambda \frac{\langle k^2\rangle}{\langle k\rangle} . \qquad (7)$$

Only if $R_0$ is larger than unity does the infection prevail.
Employing Eq. (3) we find $R_0 = R_2$, such that in randomly
wired networks the basic and secondary reproductive
numbers coincide. Therefore the condition $R_2 = 1$
recovers the standard prediction of the epidemic threshold
used in epidemiology, assuming random mixing of the
population.
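
For comparison, the random-mixing threshold can be evaluated directly from a degree sequence. The sketch below assumes the form $R_0 = \lambda \langle k^2\rangle / \langle k\rangle$ for random networks with a broad degree distribution, so that $R_0 = 1$ gives $\lambda = \langle k\rangle / \langle k^2\rangle$; the function name `basic_threshold` is hypothetical.

```python
def basic_threshold(degrees):
    """Transmission probability at which lam * <k^2>/<k> equals one."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k / mean_k2


star_degrees = [4, 1, 1, 1, 1]   # hub plus four leaves
regular_degrees = [3, 3, 3, 3]   # every node has degree 3
```

Because only the degree sequence enters, this estimate cannot distinguish two networks with the same degrees but different wiring, which is the limitation the secondary reproductive number is meant to address.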

For the highly clustered scale-free networks, applying
the condition $R_2 = 1$ and using Eq. (2) predicts a threshold

$$\lambda_c = \frac{1}{\langle k\rangle - 1} . \qquad (8)$$

The onset of non-zero prevalence found numerically (Fig. 1)
is in good agreement with the prediction. Note that for
the highly clustered scale-free networks in general
$R_2 \neq R_0$. In particular, $R_0$ diverges with system size $N$
as $\ln N$, leading to a false prediction of $\lambda_c = 0$ in the
limit of large highly clustered scale-free networks.

In order to check the applicability of the secondary
reproductive number to empirical networks we investigate
the Internet graph. We simulate the SIS model in the
network of the Autonomous Systems at three different time
stages of its evolution [20]. Figure 4 shows the prevalence
of the SIS model as a function of the transmission
probability. The threshold values predicted by the condition
$1 = \lambda_c \langle k^{nn}_h\rangle$ give a good estimate of the minimum
transmission rate above which the disease spreads. However,
using the basic reproductive number instead (Eq. (7))
gives threshold values 0.012, 0.009, 0.007 for years 1997,
1998, 2000, respectively. This underestimates the threshold
found in the simulations by at least one order of magnitude.
Similar to the highly clustered scale-free networks,
the Internet graph displays considerable degree
correlations. The mean connectivity in the neighborhoods
of the hubs is much lower than expected for
random wiring. This explains the failure of the description
by the basic reproductive number, which neglects the
strong correlations. The secondary reproductive number,
however, gives a satisfactory prediction.

We have shown the existence of a finite epidemic
threshold in highly clustered scale-free networks in the
limit of infinite system size. Our study has considered
for the first time scale-free networks with realistic
topological properties as a model for the potentially infective
contacts between individuals or nodes. We have conjectured
that the value of the threshold is related to the degree
correlations in the network, such that the product of
the transmission probability $\lambda$ and the mean connectivity
$\langle k^{nn}_h\rangle$ of the neighbors of the hubs needs to exceed unity
for the epidemic to prevail. This criterion holds precisely
for highly clustered scale-free networks. For randomly
wired scale-free networks it coincides with the standard
prediction in epidemiology given by the basic reproductive
number. The transmission probability required for
spreading on the real Internet graph is approximated well
by our criterion, whereas the basic reproductive number
drastically underestimates the value.

The existence of an epidemic threshold in highly clustered
scale-free networks contrasts with the result for
randomly wired networks, where arbitrarily weak viruses
show finite prevalence. This suggests that the spreading
of viruses in networks with scale-free degree distribution
may be suppressed by non-random wiring. In particular,
degree correlations, including the absence of direct
connections between highly connected nodes, may provide
protection against epidemics.

FIG. 1. Prevalence $\rho$ (fraction of infected individuals in
the stationary state) as a function of the spreading rate $\lambda$ for
highly clustered scale-free networks, with $\langle k\rangle = 4$ (circles),
6 (squares), and 10 (diamonds), and for random scale-free
networks with $\langle k\rangle = 6$ (solid curve). The simulations have
been run in networks containing $10^5$ nodes, averaging over
100 different realizations. Inset: Survival probability, $P_s$, for a
localized infection after $t$ time steps. Parameter values (from
bottom to top) $\lambda = 0.15, 0.18, 0.2, 0.22, 0.25$; and $\langle k\rangle = 6$.
FIG. 2. Average degree of the neighbors of a node with
connectivity $k$ in the structured networks with $\langle k\rangle = 4$
(circles), 6 (open squares), 10 (diamonds). The asymptotic values
for large $k$ are $3.0 \pm 0.1$, $5.1 \pm 0.3$, and $9 \pm 1$, to be compared
with the theoretical prediction $\langle k^{nn}_h\rangle = \langle k\rangle - 1 = 3$, 5, and
9, respectively (cf. Eq. (2)). The filled squares show the average
degree of the neighbors in random scale-free networks with
$\langle k\rangle = 6$.





FIG. 3. Dependence of the average degree of the neighbors
of a node on system size $N$. For the case of highly clustered
scale-free networks, the value has been obtained by averaging over
nodes with $k > 1000$. The theoretical prediction $\langle k\rangle - 1$
is also plotted (dashed-dotted line). For the case of the BA
networks, the values are the average over the full range of
available connectivities. The theoretical prediction $(\langle k\rangle/4)\ln N$
is also plotted.
FIG. 4. Prevalence $\rho$ as a function of the spreading rate
$\lambda$ for the Internet graph at three different times. The large
filled symbols indicate the transmission threshold calculated
according to the secondary reproductive number (Eq. (6)).
The value of $\langle k^{nn}_h\rangle$ has been obtained as an average over the
two largest hubs.